A methodology is developed to obtain subjectively optimum quantizers
for Hadamard-transformed still pictures. To exploit the perceptual
redundancies that depend upon the local properties of the picture, a
small block (2 X 2 X 2, horizontal-vertical-temporal) is used. A series
of subjective tests was carried out to determine the visibility of
impairment in the reconstructed picture when noise, which simulated the
quantization noise, was added to the Hadamard coefficients in the
transform domain (H-noise). A design procedure for quantizers was
developed using the visibility functions derived from these tests. These
quantizers minimize the "mean-square subjective distortion" (MMSSD) due
to quantization noise. The resulting picture quality and entropy were
compared with those of Max-type quantizers, which minimize the
"mean-square error" (MMSE). This comparison indicates that the MMSSD
quantizers based on subjective visibility of the quantization noise are
less companded than the MMSE quantizers. Also, for the same number of
quantization levels, pictures coded with MMSSD quantizers have better
quality and less entropy than the pictures coded with MMSE quantizers.

I. INTRODUCTION 

A methodology is developed in this paper to establish fidelity criteria 
that characterize human observers' perception of noisy transform-coded 
pictures and to obtain optimum quantizers for the transform coefficients 
based on these fidelity criteria. The perceptual effects of impairments 
introduced in the transform domain are, in general, quite different from 
the impairments introduced in the picture domain. Our experiments, 
which are performed with the Hadamard transform of a stationary 
picture, determine the visibility of impairments in the reconstructed 
picture when noise (H-noise), which simulates the quantization noise,
is added to a Hadamard coefficient. Functions that give the appropriate
subjective weighting of the quantization noise as a function of the
quantity to be quantized are derived from these experiments. These 
functions, called visibility functions, are used in a systematic way to 
design quantizers for PCM and DPCM coding of the transform coeffi- 
cients. These quantizers are compared with minimum mean-square error 
(MMSE) quantizers, both in terms of picture quality and bit rates. 

1.1 Relationship to previous work

Considerable attention has been paid to the transform domain in the 
recent work on picture coding. 1 - 14 Transform domain processing has 
several potential advantages. It produces less correlated (but not nec- 
essarily independent) coefficients. It redistributes the image energy so 
that a large amount of energy is packed in a few of the coefficients. 
Moreover, on inverse transformation at the receiver, both noise from 
quantization of coefficients and the channel errors get distributed over 
the block in a manner given by the inverse transform of a particular 
coefficient. 

A number of different transforms have been investigated; among them 
are: Karhunen-Loeve, Fourier, Hadamard, Haar, cosine, and slant 
transform. There have been several attempts 5 - 15,16 to compare the various 
transforms and to find their relative merits for coding of pictures. Almost 
all of these comparisons have been with respect to the following three 
criteria: (i) the correlation between the coefficients, (ii) the mean-square 
approximation error caused by setting some of the coefficients to zero, 
and (iii) the computational complexity in obtaining transform
coefficients from picture elements (pels) and vice versa. Perceptual factors
and the dependence of the picture quality on the particular transform 
and the block size have not received the attention they deserve. 

The irreversible processing of the transform coefficients, which de- 
termines the trade-off between picture quality and bit rate, has been 
performed in a number of ways; for example, (i) zonal sampling or 
masking, which drops some predetermined higher-order coefficients; 
(ii) threshold sampling, which drops those coefficients whose values are
below a predetermined threshold (a certain amount of addressing
information must be sent in using this technique); (iii) quantization of the
coefficients, where both amplitude (PCM) and differential amplitude (DPCM)
quantization 9,14 have been considered. Most of the work on quantization
of the coefficients has centered around minimization of mean-square 
error as a criterion in designing the quantization characteristics. Several 
assumptions on the probability of the coefficients have been made, in- 
cluding the familiar gaussian case, 15 to carry out this minimization. 
Exploitation of the psycho-visual properties of the viewer and the
optimization of the quantizers for the best subjective quality of the picture
have often been mentioned in the literature; however, no systematic
methods are available for achieving these.

Our work on obtaining the quantization characteristics may be com- 
pared with that reported by Landau and Slepian 1 and Tasto and Wintz. 4 
For this reason, we give a brief review of their reports. In both, only a
single frame of picture data is used despite the fact that the quantization 
noise is more visible when a sequence of frames of the same scene is 
coded. 

Landau and Slepian considered both Karhunen-Loeve and Hadamard 
basis vectors for the linear transformation and found that the 
Karhunen-Loeve transformation required solution of an almost de- 
generate eigenvalue problem. They then used Hadamard transformation 
with a 4 X 4 block. The number of quantization levels given to each of 
the first ten coefficients was approximately proportional to the variance 
of that coefficient, and the last six coefficients (H11 to H16) were dropped.
The first coefficient was quantized by a 64-level uniform quantizer.
Coefficients H2 through H10 were quantized with quantizers having a
companding characteristic given by a function of the form y = k√x.

Two arguments led Landau and Slepian to this quantization strategy. 
Firstly, since the variances of the lower coefficients are in general larger, 
coding them more accurately reduced the mean-square error. Secondly, 
the higher coefficients tend to be large in the busy regions of the pictures, 
where the viewers have more tolerance to amplitude errors. Thus, they 
used in an empirical way the consideration of a characteristic of the 
viewer as well as the statistics in the design of quantizers. They carried 
out over 100 experiments in which the decision levels and the repre- 
sentative levels of the quantizers were changed. However, since the 
number of choices is so large, their search could not be exhaustive and, 
therefore, their quantizers are the best only among those that they in- 
vestigated. 

Tasto and Wintz proposed an encoder using a 6 X 6 adaptive 
Karhunen-Loeve transform whose coefficients are quantized by what 
the authors call a "subjectively" optimized system of quantizers. This 
is done by first starting with a quantizer that minimizes the mean-square 
quantization error and then changing it by a trial-and-error procedure 
to obtain the "best" picture quality in the authors' judgment. The "best" 
is again from among those encountered in the trial-and-error procedure. 
They also conducted subjective rating experiments to compare the 
performance of the minimum mean-square quantizers with the "best" 
quantizers. 

1.2 Basic objectives and approach 

Our basic objectives are to obtain fidelity criteria in the transform
domain which incorporate psycho-visual properties, and to develop
systematic methods for the optimum design of coders based on these
fidelity criteria. As mentioned above, perceptual properties of the human
viewer have not been given sufficient importance in the transform-coding
literature and, consequently, good models do not exist to explain the
subjective effects of the quantization errors in the coefficients when the
coefficients are inverse transformed to obtain the picture elements
(pels).

In the pel domain, some efforts 17-19 have been made to measure 
properties of human vision in psychophysical experiments and then 
utilize these to design coders. It is not easy to extend or utilize these 
techniques for the transform domain where we deal with blocks of pels 
instead of one pel at a time. Imperfect reproduction of coefficients of the 
block distributes distortion over the entire block upon inverse trans- 
formation. 

To take advantage of both the perceptual and statistical properties, 
some of the factors one has to study are: 

(i) Spread of the quantization error by inverse transformation.
(ii) Visibility of the quantization error in different coefficients.
(iii) Statistical decorrelation.
(iv) Probability distributions of the coefficients.

In this paper, we do not attempt to solve this general problem but 
restrict ourselves to nonadaptive coding of stationary pictures using a 
2 X 2 X 2 (horizontal-vertical-temporal) Hadamard transform. Although
a temporal structure of the block is not relevant for still pictures, it will 
be used in the next phase of our work which will treat coding of a se- 
quence of pictures. The Hadamard transformation has been chosen for 
its simplicity in implementation. The objective in choosing a small block 
is to exploit the perceptual redundancies which depend on local prop- 
erties of the picture. The small block ensures that the quantization noise 
can be placed in parts of the picture where it is least visible. However, 
it does result in some loss of coding efficiency on statistical grounds. To 
compensate at least partially for this, we also discuss the differential 
coding of the first transform coefficient H1.

In Fig. 1 the definition of the Hadamard coefficients for the block size 
2 X 2 X 2 is given. H1 is the sum of the element brightnesses within the
block. H2 is the sum of the line differences within the block. H3 is the
planar difference. H4 is the sum of the element differences within the
block. It may be noted that for stationary pictures, H5, H6, H7, and H8
are all zero; further, any noise added to the first four coefficients gets 
repeated in the reconstructed signal at half the frame rate due to the 
block structure. As mentioned earlier, in coding frames of a single picture
scene, the "nonmoving" noise patterns are, in general, less annoying than
the moving noise patterns normally encountered in a television system
and, for this reason, a system was built to give a more realistic
representation of television coding impairment. This system is described in
Section II.

Fig. 1. Definition of Hadamard coefficients. The pel positions are A, B, C, D, E, F, G,
H. The Hadamard coefficients are H1, H2, H3, H4, H5, H6, H7, H8.
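
As a concrete illustration of the verbal coefficient definitions above, the following Python sketch computes an unnormalized 2 X 2 X 2 Hadamard transform and its inverse. The pel layout (A, B on the upper line and C, D on the lower line of one frame, with E, F, G, H in the same positions of the other frame), the sign conventions, the ordering of H5 to H8, and the function names are assumptions made for this example, since Fig. 1 itself is not reproduced here.

```python
def spatial_terms(p, q, r, s):
    """Sum and difference terms of one 2 X 2 frame block.

    Assumed layout: p, q on the upper line; r, s on the lower line.
    """
    return (
        p + q + r + s,      # sum of element brightnesses
        (p + q) - (r + s),  # sum of line differences
        (p + s) - (q + r),  # planar (diagonal) difference
        (p - q) + (r - s),  # sum of element differences
    )


def hadamard_2x2x2(frame0, frame1):
    """Return [H1, ..., H8] for two co-sited 2 X 2 blocks (unnormalized)."""
    (a, b), (c, d) = frame0
    (e, f), (g, h) = frame1
    s0 = spatial_terms(a, b, c, d)
    s1 = spatial_terms(e, f, g, h)
    sums = [x + y for x, y in zip(s0, s1)]    # H1..H4: frame sums
    diffs = [x - y for x, y in zip(s0, s1)]   # H5..H8: frame differences
    return sums + diffs


def inverse_hadamard_2x2x2(coeffs):
    """Recover the two 2 X 2 blocks from [H1, ..., H8]."""
    def inverse_spatial(terms):
        total, line, planar, element = terms
        p = (total + line + planar + element) / 4
        q = (total + line - planar - element) / 4
        r = (total - line - planar + element) / 4
        s = (total - line + planar - element) / 4
        return ((p, q), (r, s))

    sums, diffs = coeffs[:4], coeffs[4:]
    s0 = [(x + y) / 2 for x, y in zip(sums, diffs)]
    s1 = [(x - y) / 2 for x, y in zip(sums, diffs)]
    return inverse_spatial(s0), inverse_spatial(s1)


# A stationary picture: the same block appears in both frames.
frame = ((10, 12), (20, 26))
coeffs = hadamard_2x2x2(frame, frame)
```

For the stationary block in the usage lines, the last four coefficients come out as zero, which is the property noted in the text for H5 through H8.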

Our method for determining the visibility functions of the noise in H2,
H3, and H4 consists of the following. We add H-noise (which simulates
the quantization noise) to a coefficient whenever its magnitude exceeds
a threshold. This is done because each of these coefficients consists of
difference quantities of pels and, therefore, may be expected to mask
the noise as some function of its amplitude. For the DPCM coding of
H1, we add H-noise whenever the magnitude of the difference of H1 from
its previous block value is higher than a threshold. Again, this difference
of H1 can be taken as a measure of signal busyness. The effect of this
H-noise impairment on the picture is then compared by the subject in
an A-B test with simple additive white noise impairment of the picture.
This method of judging pictures is similar to the one used by Candy and
Bosworth. 20 The experimental method is discussed in detail in Section
II.

1.3 Summary of results 

The visibility functions for the following conditions have been
measured: H2-noise as a function of |H2|; H3-noise as a function of |H3|;
H4-noise as a function of |H4|; and H1-noise as a function of |ΔH1|, where
ΔH1 is the adjacent block difference in the horizontal direction.

The study of these visibility functions indicates that H3 is the least
important coefficient and can be dropped entirely with little impairment.
PCM quantizers with minimum mean-square subjective distortion
(MMSSD) have been designed for the H2 and H4 coefficients, and an MMSSD
DPCM quantizer has been designed for H1, using the corresponding
visibility functions as the fidelity criteria. These quantizers have been
implemented and have been compared in subjective tests with the
corresponding quantizers optimized with respect to the minimum
mean-square error criterion. Details of this approach and the results are
given in the subsequent sections.

II. EXPERIMENTAL SYSTEM 

The experimental system described in this section has been designed 
with considerable flexibility as a vehicle for future research. The system 
has real-time capabilities for adaptive and nonadaptive Hadamard 
transform coding of a 2 X 2 X 2 block of pels. 

2.1 System block diagram 

A block diagram of the experimental system is shown in Fig. 2. The 
video signal is generated by a vidicon camera scanned with 271 lines 
interlaced 2:1. The video signal has a bandwidth of 1 MHz and is sampled 
at the Nyquist rate. Each picture sample is PCM encoded with amplitude 
accuracy of 8 bits per pel. 

A frame memory is incorporated in the system to accommodate the 
transform block. Alternate frames of the digitized pictures, say the odd 
frames, are stored in the frame memory via data select switch 1. Memory 
1 consists of two line delays and four small delays for linking the data 
from the present and previous frames. It ensures, during even frames, 
simultaneous presentation of all the elements from the two frames that 
comprise the data block to the Hadamard transform logic. It may be 
noted that the system is designed for spatially overlapped block pro- 
cessing. The output corresponding to nonoverlapping blocks is selected 
by memory 2 for decoding and experimentation. The spatially overlap- 
ping blocks are suitable for the study of various kinds of predictive en- 
coders. This facility is also very useful for a flicker-free display of the 
Hadamard coefficients on a television screen. 

The Hadamard transform circuit is a serial and parallel combination 
of adders and subtracters to implement the canonic forms shown in Fig. 
1. In the processor circuit, the magnitudes of all the coefficients are 
rounded off to the eight most significant bits, which are used for further 
processing. This rounding off does not produce any visible impairment 
on inverse transformation. Capabilities exist in the processor circuit to 
insert eight independent quantizers, one for each coefficient. 

For the subjective experiments, the processor circuit permits two
modes of operation controlled by the A/B switch. The details of the
subsystem are shown in Fig. 3. The A/B switch is under the control of 
the subject. In the A mode, unimpaired coefficients are fed to the in- 
verse-transform circuit. This provides the original picture in the re- 
constructed signal domain. In the B mode, a controlled amount of 
pseudo-random noise is added to one of the coefficients only when the 
magnitude of a control signal (which is the coefficient itself in this di- 
agram) exceeds some reference threshold. This noise, which we call the 
H-noise, is generated at the sample rate by an 8-bit pseudo-random
generator that has a period of 2^15 words and is not synchronized
with the line or the frame rate of the picture. An amplitude limiter 
controls the magnitude of the noise to the level set by the experimenter. 
The sign bit for the noise word is obtained from the output of a white 
noise source, and has equal probability of being a "0" or a "1". 

Since the addition of pseudo-random noise results in doubling of the 
maximum amplitude of the noisy coefficient, the sum of the coefficient 
and noise, and the other coefficients, are divided by 2 prior to inverse 
transformation to prevent overload. 
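
The B-mode noise path described above can be summarized in a few lines of Python. This is only a sketch of the described behavior, not of the hardware: the function name add_h_noise, its parameters, and the use of Python's random module in place of the 8-bit pseudo-random generator and the white-noise sign source are assumptions made for illustration.

```python
import random


def add_h_noise(coeffs, target, control, threshold, peak, rng):
    """Return the coefficients fed to the inverse transform in B mode.

    coeffs    -- list of Hadamard coefficients for one block
    target    -- index of the coefficient that receives the noise
    control   -- control signal (for example, the coefficient itself)
    threshold -- noise is added only when |control| exceeds this value
    peak      -- amplitude limit set by the experimenter
    rng       -- random.Random instance standing in for the generators
    """
    out = list(coeffs)
    if abs(control) > threshold:
        magnitude = min(rng.randrange(256), peak)  # 8-bit word, then limiter
        sign = 1 if rng.getrandbits(1) else -1     # equiprobable sign bit
        out[target] += sign * magnitude
    # Every coefficient is halved to avoid overload in the inverse transform.
    return [c / 2 for c in out]


rng = random.Random(0)
quiet_block = add_h_noise([100, 3, 0, -2], 1, 3, 5, 20, rng)    # below threshold
busy_block = add_h_noise([100, 30, 0, -2], 1, 30, 5, 20, rng)   # above threshold
```

Passing the control signal separately from the target coefficient lets the same sketch cover the case in which the gating quantity differs from the coefficient that receives the noise.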

The inverse transformation network is similar to the transformation
network and is used to reconstruct simultaneously all of the pels of the
block.

It may be recalled that the alternate frames (odd frames) of the input 
are stored in the frame memory via data select switch 1. The
reconstructed pels corresponding to the even frames are stored in the proper
time slots in the frame memory by data select switch 1. Thus, the frame
memory contains both processed and unprocessed data and is utilized
fully. Data select switch 2 ensures that the reconstructed pels
corresponding to the even frames that are stored in the frame memory are fed
to the digital-to-analog converter in the proper time sequence.

Fig. 3. Noise adding circuits. Coefficients are divided by 2 to prevent overload in the
inverse transform function after noise addition.

Fig. 4. Original picture used for subjective tests.

The original picture used for the subjective tests is shown in Fig. 4. 
The scanned and filtered version (by a 1-MHz Picturephone® filter) is 
shown in Fig. 5a. Figure 6 shows the picture of the coefficients using 
overlapping blocks. Figure 6a shows coefficient H1, which is essentially
a "block-low-pass-filtered" version of the picture and preserves much
of the picture information. On the other hand, Figs. 6b (H2 coefficient),
6c (H3 coefficient), and 6d (H4 coefficient) show a variety of edge
information.

2.2 Experimental details 

The experimental setup for determining a visibility function is shown 
in the simplified block diagram in Fig. 7. The experimenter adds H-noise 
to a selected coefficient whenever the absolute value of the coefficient 
exceeds a threshold. The amount of noise and the threshold are varied. 
This is presented as condition B to the subject. Condition A is the un- 
impaired picture plus white noise. By turning an attenuator knob, the 
subject can control the amount of white noise added to the unimpaired 
picture. He can switch between conditions A and B by the A/B switch 
provided. An experiment consists of the subject changing the attenuator
until he finds the pictures in the switch positions A and B to be
subjectively equivalent. The subject can switch between A and B conditions
as often as he likes and can look at the test conditions as long as he likes.
When he arrives at the subjective equivalence, he gives the attenuator
reading to the experimenters on an intercom. He is then given the next
test condition. In one sitting, a subject makes 28 judgments of which the
first four are considered as training. The remaining 24 are recorded as
data. The experiment is also characterized by the following:

(i) The picture has 271 lines, interlaced 2:1 at 30 frames per second.
(ii) The visible portion of the picture is about 13 cm X 12 cm.
(iii) Highlight brightness is 74 foot-lamberts.
(iv) Low-light brightness is 4 foot-lamberts.
(v) Room illumination is 57 foot-candles.

The scan lines of the Conrac monitor were broadened to correspond to
the Picturephone display tube. Subjects were seated at a distance of
about 80 cm from the monitor. All six subjects had experience
in judging coded television pictures.

Fig. 5. Filtered test picture and H-noise added pictures. (a) Filtered test picture (1-MHz
Picturephone® filter). (b) Picture with noise added to H1. (c) Picture with noise added
to H2. (d) Picture with noise added to H4.

Fig. 6. Pictures of coefficients. (a) Picture of H1 coefficient (with no output filter). (b)
Picture of H2 coefficient (with output filter). (c) Picture of H3 coefficient (with output
filter). (d) Picture of H4 coefficient (with output filter).

III. TEST DATA AND ANALYSIS 

Results of a typical subjective test are shown in Fig. 8. In this case, the
absolute value of H4 was compared to a threshold and the noise (H-noise)
was added to H4. In this figure, H4-noise is plotted in dB on the X-axis
and the "equivalent white noise" is plotted on the Y-axis. Each data point
is an average of the readings obtained from six subjects. Under the
assumption that the equivalent white noise (V_W) is proportional to the
H-noise (V_H), the results for each threshold should fall on a 45-degree
straight line. The lines drawn in Fig. 8 are the best unity-slope straight
lines obtained by least-squares fitting to the data points. Figures 9 and
10 show similar data for H2 and H1, respectively. In the case of H1, |ΔH1|
is compared to a threshold. Notice that in each case the quantity that
is compared to a threshold is a measure of busyness of the picture in a
local area. The locations in the picture where noise is added and its
appearance are dependent upon the quantity that is compared to a
threshold and the coefficient to which the noise is added.

Fig. 8. Plot of "equivalent white noise" vs H4 noise for different thresholds on |H4|.
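
With the slope fixed at unity, the least-squares line through the dB data has only one free parameter, its intercept, and that intercept is simply the mean of the differences between ordinates and abscissas. The short Python sketch below shows the calculation; the function name and the sample readings are illustrative assumptions and are not measured data from the tests.

```python
def unity_slope_intercept(h_noise_db, white_noise_db):
    """Least-squares intercept c of the line y = x + c.

    Minimizing sum((y_i - x_i - c)**2) over c gives c = mean(y_i - x_i).
    """
    if len(h_noise_db) != len(white_noise_db) or not h_noise_db:
        raise ValueError("need two equal-length, non-empty sequences")
    diffs = [y - x for x, y in zip(h_noise_db, white_noise_db)]
    return sum(diffs) / len(diffs)


# Made-up readings (dB) for a single threshold setting.
x_db = [20.0, 25.0, 30.0]
y_db = [11.0, 15.0, 22.0]
offset_db = unity_slope_intercept(x_db, y_db)
```

Under the proportionality assumption, this intercept is an estimate, in dB, of the constant relating the equivalent white noise to the H-noise at the threshold in question.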

The pictures with H-noise impairments are shown in Fig. 5. The
H-noise added in each of the three pictures has a peak value of 100 units*
(signal range is 0 to 255 units). In Fig. 5c, noise is added to H2 in all blocks
which have |H2| more than five units, whereas in Fig. 5d, noise is added
to H4 in all blocks in which |H4| is more than five units. Since H2 is the line
difference, Fig. 5c has noise whenever an edge has a sufficiently large
component along the horizontal direction, whereas Fig. 5d has noise
whenever an edge has a sufficiently large component along the vertical
direction. Also notice the difference in the appearance of the noise.
H2-noise is much more noticeable than H4-noise. Figure 5b shows noise
of 100 units added to H1 whenever |ΔH1| is more than 10 units. Here
again, noise gets added to all blocks having horizontal interblock edges.
Also, the pattern generated by H1-noise is much more objectionable than
the pattern generated by either H2-noise or H4-noise.

* This is much more than the noise used for any test condition, but has been used to
demonstrate the effects in a photograph.

Fig. 9. Plot of "equivalent white noise" vs H2 noise for different thresholds on |H2|.

Fig. 10. Plot of "equivalent white noise" vs H1 noise for different thresholds on
|ΔH1|.

The relationship of proportionality between the equivalent white noise
and the H-noise for a threshold z is written in the form:

    V_W = F(z) V_H,    (1)

where z can take on the value of |H2|, |H3|, |H4|, or |ΔH1|. The constant
of proportionality F(z) is the equivalent white noise power when a unit
H-noise is added to the particular coefficient for all blocks of the picture,
where the magnitude of the corresponding coefficient (|ΔH1| in the case
of H1) is greater than or equal to the threshold z. We next assume the
additivity of the equivalent white noise power with respect to the
coefficient value; i.e., if the equivalent white noise power when a unit of
H-noise is added to H2 and T1 ≤ |H2| < T2 is V_{W1}, and the equivalent
white noise power when a unit of H-noise is added to H2 and T2 ≤ |H2|
< T3 (T1 < T2 < T3) is V_{W2}, then the equivalent white noise power when
a unit of H-noise is added to H2 and T1 ≤ |H2| < T3 is (V_{W1} + V_{W2}).
Under this assumption, F(z) can be written as an elemental sum of the
equivalent white noise powers. Thus,

    F(z) = \int_{z}^{\infty} f(x)\,dx,    (2)

where f(z) is called the visibility function.
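
Equation (2) implies f(z) = -dF/dz, so the visibility function can be estimated by finite differences from values of F measured at a set of thresholds. The Python sketch below does this; the function name, the midpoint convention, and the sample values are assumptions made for illustration and are neither the authors' data nor necessarily their exact numerical procedure.

```python
def visibility_from_cumulative(thresholds, f_cumulative):
    """Estimate f at interval midpoints from samples of F(z).

    By the additivity assumption, F(z_i) - F(z_{i+1}) is the equivalent
    white noise power contributed by coefficient magnitudes in the range
    z_i <= |H| < z_{i+1}; dividing by the interval width gives the
    average value of f over that interval.
    """
    if len(thresholds) != len(f_cumulative):
        raise ValueError("thresholds and F values must have equal length")
    points = []
    for i in range(len(thresholds) - 1):
        width = thresholds[i + 1] - thresholds[i]
        if width <= 0:
            raise ValueError("thresholds must be strictly increasing")
        midpoint = (thresholds[i] + thresholds[i + 1]) / 2
        density = (f_cumulative[i] - f_cumulative[i + 1]) / width
        points.append((midpoint, density))
    return points


# Illustrative numbers only.
z = [0, 4, 8, 16]
F = [1.0, 0.5, 0.25, 0.125]
estimate = visibility_from_cumulative(z, F)
```

Because differences of noisy measurements amplify the noise, some smoothing of the measured F values may be needed in practice before differencing.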

Using this procedure, visibility functions were computed. They are
shown in Fig. 11. Notice that we have assumed that the occurrences of
positive and negative coefficients (ΔH1, H2, H3, H4) are similar, and the
noise visibility does not depend upon the sign of the coefficient. This
results in the visibility functions being symmetrical about zero. The value
of the visibility function shows the relative importance of the various
transform coefficients. The larger the value, the more important is the
coefficient. In general, the visibility functions decrease as a function of
their arguments. This is a combined effect of several factors, such as (i)
the decrease in the number of blocks having large coefficient values
(|ΔH1| in the case of H1), (ii) the dependence of the perception of noise
on the magnitudes of the coefficients (which correspond to the sharpness
of the boundary in the pel domain), and (iii) the contextual importance
of the specific regions of the picture.

Psycho-visual techniques which measure the detectability of
perturbations in the neighborhood of edges, 18,21-23 and the just noticeable
differences in the amplitudes of edges have been widely applied to DPCM
coding. Since these deal with over-simplified stimuli and surround and
are almost always detection experiments, their use in picture coding may
not always result in better coders. In any case, these techniques cannot
be easily applied in the transform domain because we are not dealing
with single pels but with blocks comprising pels from more than a
single line and frame. Also, the perturbations must be introduced in the
transform coefficients, whereas the annoyance caused by the perturbations
must be judged in the pel domain.

Fig. 11. Plot of visibility functions. Notation Hi, Hj indicates noise added to Hi when
Hj is thresholded. The Hj threshold level is shown on the X-axis.

Our approach, which obtains the visibility functions as outlined above,
has the following limitations:

(i) Since the visibility functions are tied to the picture content, they
admittedly vary from picture to picture, especially if the picture content
is changed significantly. They also depend upon the class of viewers and
the viewing conditions. Thus, any optimization based on the visibility
functions is strictly applicable to a restricted situation. This is
understandable since human perception does indeed vary with the picture
content, the viewing conditions, and the particular viewer. We
demonstrate in Section IV that the results we obtained using these visibility
functions are not overly sensitive to the picture content and are
reasonable for a class of pictures rather than a particular picture.

(ii) The simulation of the quantization noise by the H-noise is fairly
accurate for H2, H3, and H4. However, in the case of H1, DPCM
techniques are used to code ΔH1. The changes in the appearance of noise as
a function of threshold are not completely reflected in the measurements;
i.e., while the noise that is added to H1 does look like granular noise at
low thresholds, it does not look like slope overload at high thresholds.
Also, at high thresholds, the noise is added to fewer blocks in the picture
and the appearance of such an impaired picture is different from the
appearance of a white noise impaired picture. Therefore, in some cases
subjective equivalence is hard to achieve.

(iii) It would be better if the perceptual, statistical, and contextual
effects were explicit in the visibility function and could be controlled 
separately. Unfortunately, such is not the case. 

(iv) It is seen from eq. (2) that the process of obtaining the visibility 
function involves differentiation of the data, which is known to introduce 
some noise. By adding H-noise to a coefficient when the quantity to be 
compared to a threshold is within a small range of values, it is possible 
to avoid this differentiation. 



IV. RESULTS 

In this section, we present certain conclusions drawn from the visibility
functions and then describe their application to the design of quantizers
for the coefficients. Visibility functions shown in Fig. 11 clearly show
the relative importance of various coefficients. H1 is the most important,
H2 is the next, followed by H4, and H3 is the least important. The
visibility of H-noise depends upon the patterns associated with a particular
coefficient. These patterns depend upon the inverse transform and are
shown in Fig. 12 for H1, H2, and H4, respectively. In each case, noise of
a given amount is added to one of the coefficients, and the background
is assumed to be flat. The higher the spatial frequency of the pattern,
the lower the visibility of the noise. Thus, H2 noise is more visible than
H4 noise because the interlace gives the H2 noise pattern lower spatial
frequency than the H4 noise pattern.

4.1 Visibility of H1 noise

An experiment was performed to utilize the well-known property of
the human eye that the brightness discrimination decreases as the
brightness level increases, called Weber's law in the psycho-visual
literature 24-26 (see Ref. 27 for a recent application of Weber's law for
picture coding in the pel domain).

Fig. 12. Pictures of noise patterns for coefficients. (a) Picture of noise added to H1 on
a flat background. (b) Picture of noise added to H2 on a flat background. (c) Picture of
noise added to H3 on a flat background. (d) Picture of noise added to H4 on a flat
background.

In this experiment, noise was added to H1 as a function of H1, since
H1 corresponds to the average brightness in the block. The results of this
subjective experiment showed large variations from observer to observer.
When the data for the observers were averaged, there was no significant
variation in the visibility of noise as a function of H1. This could be due
to the following: (i) If the gamma of the display tube used was not unity,
it would have partially compensated for Weber's law effects. (ii) We were
working with the head and shoulders view of a person. In general, for such
a picture, the highlights are on the forehead or the cheek of the person.
These regions are contextually very important, causing the visibility of
noise to be high. (iii) The picture we used was such that the low-light
areas had more spatial detail than the highlight areas; thus, the latter
two effects may have compensated for Weber's law.

Measurements of the gamma of the monitor indicate that the visibility
function in this case cannot be fully explained on the basis of the
compensation of Weber's law by the gamma of the display tube. It seems that
at least for this class of pictures, namely the head and shoulders view of
a person, the advantage that could be gained by the utilization of Weber's
law is compensated for by the other effects.

4.2 "Frozen" vs "unfrozen" noise visibility 

It may be recalled that the experiment on visibility was done using
a block size of 2 X 2 X 2. In this case, any noise added to the first four
coefficients remained unchanged for two frames. Such coefficient noise
may be called "frozen" noise because it remains unchanged for two frame
periods. An experiment was performed to determine the visibility
functions for the H4 coefficient for a block size of 2 X 2
(horizontal-vertical). Since all the experiments have been carried out
with a stationary picture, the only difference between the experiment
with the block size of 2 X 2 and that with a block size of 2 X 2 X 2 is in
the coefficient noise. For the block size of 2 X 2, the coefficient noise
changes from frame to frame and is called "unfrozen" noise. Figure 13
shows the visibility functions for H4 with "frozen" and "unfrozen" noise.
Although the visibility functions of "unfrozen" noise are generally a
little lower than those of "frozen" noise, which has a lower temporal
frequency, the differences are small.
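The distinction between the two kinds of noise can be illustrated with a minimal sketch. The function name, the uniform noise distribution, and the amplitude are assumptions made here for illustration; the text does not specify how the noise samples were drawn.

```python
import random

def coefficient_noise(num_frames, frozen, amplitude=1.0, seed=0):
    """Return one noise sample per frame for a single transform coefficient.

    frozen=True  : a new sample is drawn every second frame and held,
                   as for a 2 X 2 X 2 block.
    frozen=False : a new sample is drawn on every frame,
                   as for a 2 X 2 block.
    The uniform distribution is a placeholder, not the one used in the paper.
    """
    rng = random.Random(seed)
    samples = []
    current = 0.0
    for frame in range(num_frames):
        if not frozen or frame % 2 == 0:
            current = rng.uniform(-amplitude, amplitude)
        samples.append(current)
    return samples

frozen_noise = coefficient_noise(8, frozen=True)
unfrozen_noise = coefficient_noise(8, frozen=False)
```

In `frozen_noise` each pair of consecutive frames shares one value, so the noise pattern changes at half the frame rate; in `unfrozen_noise` every frame gets a fresh value.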

In Section III, it was mentioned that the visibility functions can be 
used as fidelity criteria for the design of quantizers. We describe below 
how these results are used to design quantizers. 

4.3 PCM coding of H2, H3, and H4

It is assumed that little interaction exists between the Hadamard
coefficients, so that the quantization transfer characteristics for the
coefficients can be obtained independent of each other. It is recalled from
Fig. 11 that H3 was the least important coefficient and, therefore, it was
decided to drop the transmission of H3 altogether.

Fig. 13 — Visibility function for "frozen" and "unfrozen" H4 noise.

Minimum mean-square error quantizers are obtained by minimizing
the mean-squared quantization error. If N is the number of levels, and
$P_{H_k}(\cdot)$ is the histogram for $|H_k|$, then we minimize the
distortion D given by

$$D = \sum_{j=1}^{N} \int_{X_j}^{X_{j+1}} \left(|H_k| - Y_j\right)^2 P_{H_k}(|H_k|) \, d(|H_k|), \qquad k = 2, 3, 4, \tag{3}$$

with respect to $\{X_j\}$, j = 2, ..., N and $\{Y_j\}$, j = 1, ..., N. This
gives us the well-known Max quantizer (Ref. 28). MMSE quantizers are
obtained by weighting the quantization error according to the frequency
of its occurrence. Minimum mean-square subjective distortion quantizers,
on the other hand, weight the quantization error according to its
subjective visibility. This can be achieved by substituting
$f_{H_k}(\cdot)$ for $P_{H_k}(\cdot)$ in the expression for the distortion.
The term $f_{H_k}(\cdot)$ is the visibility function for the coefficient
$H_k$. Standard programming techniques were used to minimize the
distortion D in both cases.
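One way to carry out such a minimization is the iterative procedure usually associated with Lloyd and Max. The text says only that standard programming techniques were used, so this is a substitute technique rather than the authors' procedure. It works on a discretized version of eq. (3), with the weight set to the histogram for MMSE and to the visibility function for MMSSD. The function name and both exponential weight curves are hypothetical.

```python
import math

def design_quantizer(values, weights, num_levels, iterations=200):
    """Reduce sum_j sum_{v in cell j} (v - Y_j)^2 * w(v) by Lloyd iteration.

    values  : sorted sample points of |H_k|
    weights : nonnegative weight per sample point (histogram for MMSE,
              visibility function for MMSSD)
    Returns (thresholds X_2..X_N, representative levels Y_1..Y_N).
    """
    lo, hi = values[0], values[-1]
    # Start from uniformly spaced representative levels.
    step = (hi - lo) / num_levels
    levels = [lo + (j + 0.5) * step for j in range(num_levels)]
    for _ in range(iterations):
        # Thresholds lie midway between neighboring levels.
        thresholds = [(a + b) / 2 for a, b in zip(levels, levels[1:])]
        sums = [0.0] * num_levels
        mass = [0.0] * num_levels
        for v, w in zip(values, weights):
            j = sum(v >= t for t in thresholds)  # index of the cell holding v
            sums[j] += w * v
            mass[j] += w
        # Each level moves to the weighted centroid of its cell;
        # a cell with no weight keeps its previous level.
        levels = [s / m if m > 0 else y for s, m, y in zip(sums, mass, levels)]
    thresholds = [(a + b) / 2 for a, b in zip(levels, levels[1:])]
    return thresholds, levels

values = list(range(64))
histogram = [math.exp(-v / 4.0) for v in values]    # hypothetical, fast decay
visibility = [math.exp(-v / 16.0) for v in values]  # hypothetical, slow decay

_, mmse_levels = design_quantizer(values, histogram, 4)
_, mmssd_levels = design_quantizer(values, visibility, 4)
```

With these placeholder curves the faster-decaying weight pulls the representative levels toward zero, so `mmse_levels` spans a smaller range than `mmssd_levels`. Lloyd iteration converges to a local minimum only; other minimization methods could equally be used.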

The histograms for |ΔH1|, |H2|, |H3|, and |H4| are shown in Fig. 14.
In general, these decrease faster than the visibility functions. This is
exemplified in Fig. 15, in which the histogram and the visibility function
for ΔH1 are plotted with the same scale on the X-axis. We shall see later
that this fact results in larger companding of the MMSE quantizers than
the MMSSD quantizers and, consequently, poor reproduction of busy
areas of a picture.

Fig. 14 — Histograms of coefficients. All the histograms are assumed to be symmetric
about zero.

Typical quantizer characteristics are shown in Fig. 16. The MMSE
quantizer is more companded than the MMSSD quantizer. Note also that
the dynamic range of the MMSE quantizer is smaller. The performance
of these two types of quantizers (MMSSD and MMSE) was compared in
an A-B test with different numbers of levels. Figure 17 shows the results
of such a test for the coefficient H4. In this test, MMSE quantizers with
5 to 8 levels were compared in terms of picture quality with MMSSD
quantizers with 3 to 9 levels, using random pairing, by six skilled
subjects. The numbers in the table indicate the percentage of observers
who preferred the MMSSD quantizers over the MMSE quantizers. The
picture coded with the 5-level MMSSD quantizer was preferred by 100
percent of the subjects over that coded with the 6-level MMSE quantizer.
Figure 18 shows similar comparisons for the quantization of H2. Here
again, for the same number of levels, the picture quality using the MMSSD
quantizers is always better than that using MMSE quantizers. Moreover,
the picture quality using the 6-level MMSSD quantizer and the 7-level MMSE
quantizer is equivalent.
Figures 19 and 20 show the entropy of the quantized output using both
the MMSSD and the MMSE quantizers having 3 to 8 levels for H4 and H2,
respectively. In the case of H4, the difference between the entropies of
the output of the MMSE and the MMSSD quantizer for the same number
of levels is about 0.2 bit. Since the picture quality with the 7-level
MMSSD (visibility) quantizer was better than with the 8-level MMSE
quantizer, the gain by the use of the MMSSD quantizer is of the order of
0.5 to 0.6 bit for the transmission of H4. Similar remarks can be made
about the quantization of H2.

Fig. 15 — Comparison of probability density (P) and visibility (V) for H1 noise.

Fig. 17 — Comparison of picture quality of MMSSD and MMSE quantizers of different
levels for H4.
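The entropy figures quoted above are first-order entropies of the quantizer output. A minimal sketch of that calculation follows, assuming the coefficient histogram is available as sampled probabilities and the quantizer is described by its decision thresholds. The function name and the example data are hypothetical, and only coefficient magnitudes are treated.

```python
import math

def output_entropy(values, probabilities, thresholds):
    """First-order entropy, in bits per sample, of the quantizer output index.

    values        : sample points of the coefficient magnitude
    probabilities : histogram value at each sample point (need not sum to 1)
    thresholds    : sorted decision thresholds separating the output levels
    """
    cell_prob = [0.0] * (len(thresholds) + 1)
    for v, p in zip(values, probabilities):
        cell_prob[sum(v >= t for t in thresholds)] += p
    total = sum(cell_prob)
    return -sum((p / total) * math.log2(p / total) for p in cell_prob if p > 0)

values = list(range(64))
uniform_hist = [1.0 / 64] * 64                      # hypothetical flat histogram
peaked_hist = [math.exp(-v / 4.0) for v in values]  # hypothetical peaked histogram

entropy_uniform = output_entropy(values, uniform_hist, [16, 32, 48])
entropy_peaked = output_entropy(values, peaked_hist, [16, 32, 48])
```

For the flat histogram the four output levels are equally likely and the entropy equals log2(4) = 2 bits. For the peaked histogram most samples fall in the first cell and the entropy is lower, which is why the choice of thresholds affects the bit rate as well as the picture quality.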



4.4 Coding of H1

Unlike the H2, H3, and H4 coefficients, the H1 coefficient is not a
difference signal. It represents the average brightness within the block
and thus carries the low-frequency information, which should be coded
relatively precisely. Uniform PCM coding of H1 requires 7 to 8 bits for
good picture quality. As mentioned before, efforts to compand the PCM
quantizer by using the Weber's law effect were not very successful.
Therefore, it was decided to DPCM encode H1. Since the block size used
is small, there is substantial correlation between the H1 values of
adjacent blocks. This was exploited by using DPCM coding of H1 with
horizontally adjacent blocks for prediction. The quantizers for such a DPCM
coder are obtained from the visibility function of H1 under the control
of |ΔH1|, in a manner similar to the above, by minimizing the mean-square
subjective distortion due to the quantization noise. The resulting
quantizer scales are companded due to the monotonic decrease of the
visibility function with respect to ΔH1, as shown in Fig. 11. Quantizer
scales have also been obtained by minimizing the mean-square
quantization error.* As noted before, MMSE quantizer scales are more
companded and have less dynamic range compared to the MMSSD quantizer
scales. Using these two types of scales, experiments have been performed
to compare the picture quality for the same number of levels. The results
of such a comparison are shown in Fig. 21. It is seen that, for the same
number of levels, all the subjects preferred the picture coded with the
MMSSD quantizers over the picture coded with MMSE quantizers.
Moreover, picture quality using a 24-level MMSSD quantizer is equivalent
to the picture quality using a 30-level MMSE quantizer. Entropies of the
quantized signal with MMSSD and MMSE quantizers of different levels
are shown in Fig. 22. Here again, visibility quantizers perform better than
MMSE quantizers by about 0.35 bit per block for the same number of
levels. Picture quality using a 24-level MMSSD quantizer can be produced
by an MMSE quantizer with an increase in entropy of 0.6 bit per block.
It is worth noting that, due to the DPCM coding of H1, the bits required
for H1 could be almost halved. However, H1 still remains more important
than H2 and H4 and requires more bits for satisfactory transmission.

* Although the visibility function and the histograms are obtained from the difference
signal |ΔH1|, and the quantity that is quantized is the differential signal (i.e., the difference
between the present H1 and the coded value of H1 from the previous block), it is expected
that the quantizer characteristics will not change appreciably by using the difference instead
of the differential signal in eq. (3).

Fig. 19 — Plots of entropy of outputs of MMSSD and MMSE quantizers of different levels
for H4.

Fig. 20 — Plots of entropy of outputs of MMSSD and MMSE quantizers of different levels
for H2.

Fig. 21 — Comparison of picture quality of MMSSD and MMSE quantizers of different
levels for DPCM coding of H1.
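The DPCM scheme described in this section, in which each H1 value is predicted from the coded H1 of the horizontally preceding block, can be sketched as follows. The function names, the seven companded representative levels, and the starting prediction are hypothetical; the quantizers in the text have many more levels and are derived from the visibility function.

```python
def dpcm_encode(h1_row, levels, initial_prediction=0.0):
    """DPCM-code the H1 values along one row of blocks.

    levels : signed representative levels for the prediction error.
    Returns (level indices, reconstructed H1 values).
    """
    prediction = initial_prediction
    indices, reconstructed = [], []
    for h1 in h1_row:
        error = h1 - prediction
        # Pick the nearest representative level for the prediction error.
        idx = min(range(len(levels)), key=lambda i: abs(levels[i] - error))
        # The coded value becomes the prediction for the next block,
        # so encoder and decoder stay in step.
        prediction = prediction + levels[idx]
        indices.append(idx)
        reconstructed.append(prediction)
    return indices, reconstructed

def dpcm_decode(indices, levels, initial_prediction=0.0):
    """Rebuild the H1 values from the transmitted level indices."""
    prediction = initial_prediction
    out = []
    for idx in indices:
        prediction = prediction + levels[idx]
        out.append(prediction)
    return out

companded_levels = [-20, -8, -2, 0, 2, 8, 20]   # hypothetical quantizer scale
row = [100, 102, 101, 110, 140, 141]            # hypothetical H1 values
indices, coded = dpcm_encode(row, companded_levels, initial_prediction=100.0)
decoded = dpcm_decode(indices, companded_levels, initial_prediction=100.0)
```

Because the prediction is formed from the coded value rather than the original, `decoded` matches `coded` exactly and quantization errors do not accumulate along the row. The large jump from 110 to 140 exceeds the largest level, which illustrates why the dynamic range of the quantizer scale matters.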

4.5 Combined quantization of all coefficients 

Combined quantization of all the coefficients requires investigation
of the optimal number of quantizer levels to be given to each one of them.
In the case of Gaussian random vectors using the Karhunen-Loeve
transformation and a mean-square error criterion, optimal bit allocation
for the various transform coefficients is well known (Refs. 29 and 30).
However, in our case, none of these assumptions are strictly valid. In
fact, our assumption that the optimum quantizer characteristics for
different coefficients can be obtained independently is not strictly true
and, for this reason, we tried to evaluate the picture quality by
quantizing all the coefficients. By trial and error, a near-perfect
picture was produced by using 36, 13, and 7 quantization levels for H1,
H2, and H4, respectively, and by dropping H3. This resulted in a total
entropy of about 2.17 bits per pel. In single-frame photographic
reproduction, no difference could be observed between the coded picture
and the low-resolution original shown in Fig. 5a. Several other "head and
shoulders" type of pictures were coded using the same combination of
levels. Although in each case the picture appeared to have reasonable
quality, the visibility of the quantization noise and the resulting
picture quality varied slightly. This implies that the quantizers we
obtained by optimizing the visibility of the quantization noise for one
particular picture were not overly sensitive to variation in picture
content.
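As a rough consistency check on the figure of 2.17 bits per pel, the number of levels gives an upper bound on the entropy: a quantizer with n levels can never need more than log2(n) bits per sample. The sketch below assumes that the three transmitted coefficients are shared by the 4 pels of one 2 X 2 spatial block; that normalization is an assumption made here and is not stated in this section.

```python
import math

levels = {"H1": 36, "H2": 13, "H4": 7}  # quantization levels quoted in the text
pels_per_block = 4                       # assumption: one 2 X 2 spatial block

# Fixed-length coding bound: log2(levels) bits per coefficient.
bits_per_block = sum(math.log2(n) for n in levels.values())
fixed_length_bound = bits_per_block / pels_per_block  # about 2.92 bits per pel
```

Under this assumption the bound of about 2.92 bits per pel lies above the reported entropy of 2.17 bits per pel, as it must, and the gap indicates the saving that entropy coding of the quantizer outputs would offer over fixed-length codes.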



V. CONCLUSIONS 

A systematic method for quantizing Hadamard coefficients has been
given. This method gives the best quantizers in a subjective and
probabilistic sense. We have compared the resulting quantizers with MMSE
quantizers and found the MMSSD quantizers to be better both in terms
of the subjective picture quality and entropy. We do not imply that there
are no better quantizers than the MMSSD quantizer, since by taking many
other factors into consideration, one could come up with a better
quantizer. We do find that the minimum visibility quantizers are
optimum with respect to our model and the approach used for weighting the
quantization noise.

Fig. 22 — Plots of entropy of outputs of MMSSD and MMSE quantizers of different levels
for DPCM coding of H1.

Investigations are in progress for adaptive and predictive coding of
the coefficients; our findings will be reported in a future paper (Ref. 31).